i-Motif folding intermediates with zero-nucleotide loops are trapped by 2′-fluoroarabinocytidine via F···H and O···H hydrogen bonds

G-quadruplex and i-motif nucleic acid structures are believed to fold through kinetic partitioning mechanisms. Such mechanisms explain the structural heterogeneity of G-quadruplex metastable intermediates, which have been extensively reported. In contrast, i-motif folding is regarded as predictable, and research on alternative i-motif folds is limited. While TC5 normally folds into a stable tetrameric i-motif in solution, we report that 2′-deoxy-2′-fluoroarabinocytidine (araF-C) substitutions can prompt TC5 to form an off-pathway, kinetically trapped dimeric i-motif, thereby expanding the scope of i-motif folding landscapes. This i-motif is formed by two strands associated head-to-head and features zero-nucleotide loops, which have not been previously observed. Through spectroscopic and computational analyses, we also establish that the dimeric i-motif is stabilized by fluorine and non-fluorine hydrogen bonds, thereby explaining the superlative stability of araF-C modified i-motifs. Comparative experimental findings suggest that the strength of these interactions depends on the flexible sugar pucker adopted by the araF-C residue. Overall, the findings reported here provide a new role for i-motifs in nanotechnology and also pose the question of whether unprecedented i-motif folds may exist in vivo.


Supp.  Low values indicate that the sugar conformation is in the South domain (P between 140° and 180°). The equivalent curve for deoxyribose (JH1′-H2″) is shown in red for comparison. In the 2′F-arabino sugar, the electronegative substituent causes a significant decrease of the ¹H-¹H J-coupling.

Supplementary Figure 14. Example of a four-cytosine model system extracted from an MD dimeric i-motif structure and optimized. The backbone phosphate group was removed; the sugar 3′ ends are capped with a hydroxyl group and the 5′ ends with a hydrogen atom.

Assignment of spikes in NCI plots
Six spikes are recognizable in the C6-C2′ NCI plot in the region where sign(λ2)ρ is at or below about -0.01 a.u. (Supplementary Figure 15B). Five of them are intra-residual: two appear in the C6 NCI plot (Supplementary Figure 15C) and three in the C2′ NCI plot (Supplementary Figure 15D). The two spikes in the C6 plot can only be assigned to the isosurfaces between 2′F and H6 and between O2 and H1′, both of which display a spot at the s = 0.013 isovalue. From the color of the 0.04 isosurface, we assign the spikes at about -0.015 and -0.02 a.u. to the 2′F···H6 and O2···H1′ interactions, respectively. Similarly, we assign the O4′···H6 and O2···H1′ interactions of the C2′ residue to the spikes at about -0.02 and -0.01 a.u., respectively. The third spike in the C2′ plot, at -0.018 a.u., is assigned to a spurious interaction involving the artificially introduced methyl group. Finally, the inter-residual C6 2′F···H4-2 C2′ interaction is assigned to the spike at about -0.01 a.u. In contrast, the C2′ 2′F···H4-2 C6 interaction is not recognizable in the NCI plot, because it lies in a more crowded region of weaker interactions (sign(λ2)ρ between 0 and -0.01 a.u.). Similar results are obtained from Supplementary Figures 17-19 for the representative two-residue models C2-C6′, C3-C5′, and C5-C3′.
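
For orientation, the two quantities on the axes of an NCI plot can be written out explicitly. The excerpt does not give these definitions, so the form below is an assumption taken from the standard NCI formulation: s is the reduced density gradient, ρ the electron density, and λ2 the second eigenvalue of the electron-density Hessian.

```latex
s(\mathbf{r}) = \frac{1}{2\,(3\pi^{2})^{1/3}}\,
                \frac{\lvert \nabla \rho(\mathbf{r}) \rvert}{\rho(\mathbf{r})^{4/3}},
\qquad
\text{abscissa: } \operatorname{sign}(\lambda_{2})\,\rho(\mathbf{r})
```

Under this convention, a spike of s toward zero at negative sign(λ2)ρ marks an attractive contact such as a hydrogen bond, and a more negative abscissa indicates a stronger interaction. This is consistent with the assignments above, where the region between 0 and -0.01 a.u. is treated as the weaker-interaction region.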

B. Supplementary Tables
Supplementary Table 1

Analysis of TH Profiles:
The TH profiles were globally fit to two different assembly models, as follows.

Sequential tetrameric assembly (Scheme A):
Step-wise association of monomers (M), to dimers (D), to trimers (Tr) and finally to tetramers (T).

Sequential tetrameric assembly with a folded dimer (Scheme B):
Step-wise association of monomers (M), to dimers (D), to trimers (Tr) and finally to tetramers (T), with the possibility for the dimer (D) to collapse intramolecularly into a folded dimer (D*).
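
The two schemes can be sketched as mass-action rate equations. The Python example below is a minimal illustration and not the authors' implementation: the stoichiometric convention (monomer consumed at twice the dimerization rate), the rate-constant values, the concentration units (μM) and the hand-written fixed-step RK4 integrator are all assumptions, and the rate constants are held fixed, that is, at a single temperature. Setting kF to zero removes the folded dimer and recovers Scheme A.

```python
# Illustrative mass-action model for Scheme B (kF = 0 recovers Scheme A).
# All names, values and units below are hypothetical placeholders.

def scheme_b_rates(y, k):
    """Time derivatives of [M], [D], [Tr], [T], [D*] (assumed units: uM, s)."""
    M, D, Tr, T, Ds = y
    r1 = k["k1"] * M * M - k["k-1"] * D    # M + M <-> D
    r2 = k["k2"] * D * M - k["k-2"] * Tr   # D + M <-> Tr
    r3 = k["k3"] * Tr * M - k["k-3"] * T   # Tr + M <-> T
    rF = k["kF"] * D - k["kU"] * Ds        # D <-> D* (intramolecular folding)
    return [-2.0 * r1 - r2 - r3, r1 - r2 - rF, r2 - r3, r3, rF]


def rk4_step(f, y, dt, k):
    """One classical fourth-order Runge-Kutta step of size dt."""
    a = f(y, k)
    b = f([yi + 0.5 * dt * ai for yi, ai in zip(y, a)], k)
    c = f([yi + 0.5 * dt * bi for yi, bi in zip(y, b)], k)
    d = f([yi + dt * ci for yi, ci in zip(y, c)], k)
    return [yi + dt / 6.0 * (ai + 2.0 * bi + 2.0 * ci + di)
            for yi, ai, bi, ci, di in zip(y, a, b, c, d)]


def integrate(y0, k, dt, n_steps):
    """Propagate the concentrations for n_steps steps of size dt."""
    y = list(y0)
    for _ in range(n_steps):
        y = rk4_step(scheme_b_rates, y, dt, k)
    return y


def total_strands(y):
    """Total strand concentration; conserved by the rate equations."""
    M, D, Tr, T, Ds = y
    return M + 2.0 * D + 3.0 * Tr + 4.0 * T + 2.0 * Ds


# Hypothetical rate constants (bimolecular: uM^-1 s^-1, unimolecular: s^-1).
RATES_B = {"k1": 1e-3, "k-1": 1e-3, "k2": 1e-3, "k-2": 1e-3,
           "k3": 1e-3, "k-3": 1e-4, "kF": 5e-3, "kU": 1e-4}
RATES_A = dict(RATES_B, kF=0.0)  # no folded dimer: Scheme A

if __name__ == "__main__":
    y0 = [50.0, 0.0, 0.0, 0.0, 0.0]  # 50 uM strands, all monomeric
    for name, k in (("Scheme A", RATES_A), ("Scheme B", RATES_B)):
        y = integrate(y0, k, dt=0.5, n_steps=4000)
        print(name, [round(v, 3) for v in y], "total:", round(total_strands(y), 6))
```

Checking that `total_strands` stays constant is a quick sanity test of the stoichiometry. A production fit would normally use an adaptive ODE solver and temperature-dependent rate constants along the heating and cooling ramps.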
The rate constants are assumed to be functions of temperature following an Arrhenius relationship. The temperature dependence of each rate constant is given by:

$$k(T) = k_0 \exp\left[-\frac{E_a}{R}\left(\frac{1}{T} - \frac{1}{T_{ref}}\right)\right]$$

where k0 is the rate constant at the reference temperature Tref, Ea is the activation energy, and R is the gas constant.

In the global fit of the TH profiles, the set of equations (1-4) was numerically integrated using the ordinary differential equation (ODE) solvers in MATLAB to obtain the concentrations of monomer, dimer, trimer, and tetramer as a function of temperature. The absorbance profiles of the monomer and tetramer were assumed to be linear with temperature. The sets of TH profiles were fit by varying the kinetic parameters to minimize the residual sum of squares (RSS) between the experimental and fitted absorbance data according to

$$\mathrm{RSS} = \sum \left( \mathrm{Abs}_{exp}(T) - \mathrm{Abs}_{calc}(T, x) \right)^2 \qquad (7)$$

where Absexp(T) and Abscalc(T) are the experimental and fitted absorbance profiles, respectively, and x = [k1, k-1, k2, k-2, k3, k-3, kF, kU, E1, E-1, E2, E-2, E3, E-3] contains the rate constants at the reference temperature and the activation energies governing assembly and disassembly of the tetramer. The 50 μM and 250 μM profiles, recorded at 0.5 and 5 °C/min, were fit together.

Errors for the fitted parameters were calculated using a bootstrapping approach,² in which each bootstrap sample was obtained by random resampling of the original data. For example, if the original dataset contained N points, each bootstrap sample was constructed by randomly selecting N of these data points, such that points may be selected more than once or not at all. A total of 500 bootstrap samples were constructed and fitted using the kinetic model described above. The errors in the extracted parameters were taken as the standard deviations of the 500 sets of parameters obtained for all bootstrap samples.
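
The resampling procedure described above can be illustrated with a short Python sketch. It is not the authors' code: a straight-line least-squares fit stands in for the full kinetic model, the toy data are synthetic, and the function names (`fit_line`, `bootstrap_errors`, `make_toy_data`) are hypothetical. Only the resampling logic (N points drawn with replacement, 500 repeats, standard deviation of the fitted parameters) follows the text.

```python
import random
import statistics


def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs.

    Stand-in for the kinetic-model fit, used here only for illustration.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_errors(points, fit, n_boot=500, seed=0):
    """Standard deviation of each fitted parameter over bootstrap samples."""
    rng = random.Random(seed)
    n = len(points)
    fitted = []
    for _ in range(n_boot):
        # Draw N points with replacement: some repeat, some are left out.
        sample = [rng.choice(points) for _ in range(n)]
        fitted.append(fit(sample))
    # zip(*fitted) groups the values of each parameter across all samples.
    return [statistics.stdev(values) for values in zip(*fitted)]


def make_toy_data(n=40, seed=1):
    """Synthetic linear 'absorbance vs temperature' data with Gaussian noise."""
    rng = random.Random(seed)
    return [(float(t), 0.002 * t + 0.5 + rng.gauss(0.0, 0.01)) for t in range(n)]


if __name__ == "__main__":
    data = make_toy_data()
    slope, intercept = fit_line(data)
    slope_err, intercept_err = bootstrap_errors(data, fit_line)
    print(f"slope = {slope:.5f} +/- {slope_err:.5f}")
    print(f"intercept = {intercept:.4f} +/- {intercept_err:.4f}")
```

Passing the fitting routine as an argument keeps the resampling loop independent of the model, so the same loop could in principle wrap a kinetic fit instead of the placeholder line fit.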